15 research outputs found

    Gesture Assessment of Teachers in an Immersive Rehearsal Environment

    Get PDF
    Interactive training environments typically include feedback mechanisms designed to help trainees improve their performance through either guided- or self-reflection. When the training system deals with human-to-human communications, as one would find in a teacher, counselor, enterprise culture or cross-cultural trainer, such feedback needs to focus on all aspects of human communication. This means that, in addition to verbal communication, nonverbal messages must be captured and analyzed for semantic meaning. The goal of this dissertation is to employ machine-learning algorithms that semi-automate and, where supported, automate event tagging in training systems developed to improve human-to-human interaction. The specific context in which we prototype and validate these models is the TeachLivE teacher rehearsal environment developed at the University of Central Florida. The choice of this environment was governed by its availability, large user population, extensibility and existing reflection tools found within the AMITIES framework underlying the TeachLivE system. Our contribution includes accuracy improvement of the existing data-driven gesture recognition utility from Microsoft; called Visual Gesture Builder. Using this proposed methodology and tracking sensors, we created a gesture database and used it for the implementation of our proposed online gesture recognition and feedback application. We also investigated multiple methods of feedback provision, including visual and haptics. The results from the conducted user studies indicate the positive impact of the proposed feedback applications and informed body language in teaching competency. In this dissertation, we describe the context in which the algorithms have been developed, the importance of recognizing nonverbal communication in this context, the means of providing semi- and fully-automated feedback associated with nonverbal messaging, and a series of preliminary studies developed to inform the research. Furthermore, we outline future research directions on new case studies, and multimodal annotation and analysis, in order to understand the synchrony of acoustic features and gestures in teaching context

    Dyadic Movement Synchrony Estimation Under Privacy-preserving Conditions

    Full text link
    Movement synchrony refers to the dynamic temporal connection between the motions of interacting people. The applications of movement synchrony are wide and broad. For example, as a measure of coordination between teammates, synchrony scores are often reported in sports. The autism community also identifies movement synchrony as a key indicator of children's social and developmental achievements. In general, raw video recordings are often used for movement synchrony estimation, with the drawback that they may reveal people's identities. Furthermore, such privacy concern also hinders data sharing, one major roadblock to a fair comparison between different approaches in autism research. To address the issue, this paper proposes an ensemble method for movement synchrony estimation, one of the first deep-learning-based methods for automatic movement synchrony assessment under privacy-preserving conditions. Our method relies entirely on publicly shareable, identity-agnostic secondary data, such as skeleton data and optical flow. We validate our method on two datasets: (1) PT13 dataset collected from autism therapy interventions and (2) TASD-2 dataset collected from synchronized diving competitions. In this context, our method outperforms its counterpart approaches, both deep neural networks and alternatives.Comment: IEEE ICPR 2022. 8 pages, 3 figure

    Pose Uncertainty Aware Movement Synchrony Estimation via Spatial-Temporal Graph Transformer

    Full text link
    Movement synchrony reflects the coordination of body movements between interacting dyads. The estimation of movement synchrony has been automated by powerful deep learning models such as transformer networks. However, instead of designing a specialized network for movement synchrony estimation, previous transformer-based works broadly adopted architectures from other tasks such as human activity recognition. Therefore, this paper proposed a skeleton-based graph transformer for movement synchrony estimation. The proposed model applied ST-GCN, a spatial-temporal graph convolutional neural network for skeleton feature extraction, followed by a spatial transformer for spatial feature generation. The spatial transformer is guided by a uniquely designed joint position embedding shared between the same joints of interacting individuals. Besides, we incorporated a temporal similarity matrix in temporal attention computation considering the periodic intrinsic of body movements. In addition, the confidence score associated with each joint reflects the uncertainty of a pose, while previous works on movement synchrony estimation have not sufficiently emphasized this point. Since transformer networks demand a significant amount of data to train, we constructed a dataset for movement synchrony estimation using Human3.6M, a benchmark dataset for human activity recognition, and pretrained our model on it using contrastive learning. We further applied knowledge distillation to alleviate information loss introduced by pose detector failure in a privacy-preserving way. We compared our method with representative approaches on PT13, a dataset collected from autism therapy interventions. Our method achieved an overall accuracy of 88.98% and surpassed its counterparts by a wide margin while maintaining data privacy.Comment: Accepted by 24th ACM International Conference on Multimodal Interaction (ICMI'22). 17 pages, 2 figure

    Social Visual Behavior Analytics for Autism Therapy of Children Based on Automated Mutual Gaze Detection

    Full text link
    Social visual behavior, as a type of non-verbal communication, plays a central role in studying social cognitive processes in interactive and complex settings of autism therapy interventions. However, for social visual behavior analytics in children with autism, it is challenging to collect gaze data manually and evaluate them because it costs a lot of time and effort for human coders. In this paper, we introduce a social visual behavior analytics approach by quantifying the mutual gaze performance of children receiving play-based autism interventions using an automated mutual gaze detection framework. Our analysis is based on a video dataset that captures and records social interactions between children with autism and their therapy trainers (N=28 observations, 84 video clips, 21 Hrs duration). The effectiveness of our framework was evaluated by comparing the mutual gaze ratio derived from the mutual gaze detection framework with the human-coded ratio values. We analyzed the mutual gaze frequency and duration across different therapy settings, activities, and sessions. We created mutual gaze-related measures for social visual behavior score prediction using multiple machine learning-based regression models. The results show that our method provides mutual gaze measures that reliably represent (or even replace) the human coders' hand-coded social gaze measures and effectively evaluates and predicts ASD children's social visual performance during the intervention. Our findings have implications for social interaction analysis in small-group behavior assessments in numerous co-located settings in (special) education and in the workplace.Comment: Accepted to IEEE/ACM international conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE) 202

    Immersive Virtual Reality and Robotics for Upper Extremity Rehabilitation

    Full text link
    Stroke patients often experience upper limb impairments that restrict their mobility and daily activities. Physical therapy (PT) is the most effective method to improve impairments, but low patient adherence and participation in PT exercises pose significant challenges. To overcome these barriers, a combination of virtual reality (VR) and robotics in PT is promising. However, few systems effectively integrate VR with robotics, especially for upper limb rehabilitation. Additionally, traditional VR rehabilitation primarily focuses on hand movements rather than joint movements of the limb. This work introduces a new virtual rehabilitation solution that combines VR with KinArm robotics and a wearable elbow sensor to measure elbow joint movements. The framework also enhances the capabilities of a traditional robotic device (KinArm) used for motor dysfunction assessment and rehabilitation. A preliminary study with non-clinical participants (n = 16) was conducted to evaluate the effectiveness and usability of the proposed VR framework. We used a two-way repeated measures experimental design where participants performed two tasks (Circle and Diamond) with two conditions (VR and VR KinArm). We found no main effect of the conditions for task completion time. However, there were significant differences in both the normalized number of mistakes and recorded elbow joint angles (captured as resistance change values from the wearable sensor) between the Circle and Diamond tasks. Additionally, we report the system usability, task load, and presence in the proposed VR framework. This system demonstrates the potential advantages of an immersive, multi-sensory approach and provides future avenues for research in developing more cost-effective, tailored, and personalized upper limb solutions for home therapy applications.Comment: Submitted to International Journal of Human-Computer Interactio

    Multimodal Assessment Of Teaching Behavior In Immersive Rehearsal Environment - Teachlive\u3csup\u3eâ„¢\u3c/sup\u3e

    No full text
    Nonverbal behaviors such as facial expressions, eye contact, gestures, and body movements in general have strong impacts on the process of communicative interactions. Gestures play an important role in interpersonal communication in the classroom between student and teacher. To assist teachers with exhibiting open and positive nonverbal signals in their actual classroom, we have designed a multimodal teaching application with provisions for real-time feedback in coordination with our TeachLivE test-bed environment and its reflective application; ReflectLivE. Individuals walk into this virtual environment and interact with five virtual students shown on a large screen display. The recent research study is designed to have two settings (7-minute long each). In each of the settings, the participants are provided lesson plans from which they teach. All the participants are asked to take part in both settings, with half receiving automated real-time feedback about their body poses in the first session (group 1) and the other half receiving such feedback in the second session (group 2). Feedback is in the form of a visual indication each time the participant exhibits a closed stance. To create this automated feedback application, a closed posture corpus was collected and trained based on the existing TeachLivE teaching records. After each session, the participants take a postquestionnaire about their experience. We hypothesize that visual feedback improves positive body gestures for both groups during the feedback session, and that, for group 2, this persists into their second unaided session but, for group 1, improvements occur only during the second session

    Improving Social Communication Skills Using Kinesics Feedback

    No full text
    Interactive training environments typically include feedback mechanisms designed to help trainees improve their performance through guided or self-reflection. When the training system deals with human-to-human communications, as one would find in a teacher, counselor or cross-cultural trainer, such feedback needs to focus on all aspects of human communication. This means that, in addition to verbal communication, nonverbal messages (kinesics in particular) must be captured and analyzed for semantic meaning. The goal of this research is to introduce interactive training models developed to improve human-to-human interaction. The specific context in which we prototype and validate these models is the TeachLivEâ„¢ teacher rehearsal environment developed at the University of Central Florida. We implemented an online gesture recognition application on top of the Microsoft Kinect software development kit with multiple feedback channels including visual and haptics. In a study of twelve participants rehearsing a teaching session in TeachLivE, we found that the online gesture recognition tool and its associated feedback method are effective and non-intrusive approaches for the purpose of communication-skill training. The algorithms employed, the results, and the implications for other interactive contexts are discussed in this paper
    corecore